Variables
Numeric
- Num, Year, Released, Runtime, Metascore, imdbRating, imdbVotes, imdbID
Text
- Title, Genre, Director, Writer, Actor, Plot, Language, Country, Awards, Type, DVD, BoxOffice, Production, Website
DataMunging Pt1
Type Modification
- Runtime -> “123min”
- BoxOffice -> “$1,234,567”
- imdbVotes -> “1,234,567”
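The conversions above can be sketched as follows. This is a minimal Python illustration, not the code used for the slides (the `##` output suggests the original analysis was done in R). The helper names `parse_runtime` and `parse_number` are hypothetical, and treating unparseable strings such as "N/A" as missing is an assumption.

```python
import re

def parse_runtime(text):
    """Return the minutes in a string such as '123min' as an int, or None."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None

def parse_number(text):
    """Drop '$' and thousands separators: '$1,234,567' -> 1234567.

    Assumes whole-number values; strings without digits yield None.
    """
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None

runtime = parse_runtime("123min")
box_office = parse_number("$1,234,567")
votes = parse_number("1,234,567")
print(runtime, box_office, votes)
```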
DataMunging Pt2
Data Split
- Genre -> “Drama, Action”
- Actor -> “Tim Robbins, Morgan Freeman”
- Writer -> “Mario Puzo (screenplay)”
- Language -> “English, Italian, Latin”
- Country -> “USA, UK”
- Awards -> “Won 3 Oscars. Another 23 wins & 27 nominations.”
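A rough Python sketch of the splitting step is shown here. The function names are hypothetical, and the way the Awards string is tallied is an assumption: the slides do not say whether "Won 3 Oscars" is counted toward the Awards total or whether "Nominated for N ..." phrases are counted toward Nominations. This sketch counts both.

```python
import re

def split_list(text):
    """Split a comma-separated field into a list of trimmed items."""
    return [item.strip() for item in text.split(",") if item.strip()]

def strip_role(name):
    """Remove a parenthesised role: 'Mario Puzo (screenplay)' -> 'Mario Puzo'."""
    return re.sub(r"\s*\([^)]*\)", "", name).strip()

def parse_awards(text):
    """Return (awards, nominations) from an awards sentence.

    Assumption: 'Won N ...' and 'N wins' both count as awards, while
    'Nominated for N ...' and 'N nominations' both count as nominations.
    """
    awards = sum(int(n) for n in re.findall(r"Won\s+(\d+)", text))
    awards += sum(int(n) for n in re.findall(r"(\d+)\s+wins?", text))
    noms = sum(int(n) for n in re.findall(r"Nominated for\s+(\d+)", text))
    noms += sum(int(n) for n in re.findall(r"(\d+)\s+nominations?", text))
    return awards, noms

genres = split_list("Drama, Action")
writer = strip_role("Mario Puzo (screenplay)")
awards, nominations = parse_awards("Won 3 Oscars. Another 23 wins & 27 nominations.")
print(genres, writer, awards, nominations)
```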
DataMunging Pt3
Removed
- Released
- Plot
- imdbID
- Type
- DVD
- Website
DataMunging Pt4
Consistency and Completeness
## Num Title Year Runtime Director Writer
## 1.000 1.000 1.000 1.000 1.000 0.996
## Metascore imdbRating imdbVotes BoxOffice Production First Actor
## 0.708 1.000 1.000 0.300 1.000 1.000
## Second Actor Third Actor Fourth Actor Awards Nominations First Genre
## 1.000 1.000 1.000 1.000 1.000 1.000
## Second Genre Third Genre l1 l2 l3 l4
## 0.908 0.632 1.000 0.496 0.224 0.088
## l5 l6 l7 c1 c2 c3
## 0.032 0.012 0.004 1.000 0.296 0.088
## c4 c5 c6 c7 c8 c9
## 0.024 0.012 0.004 0.004 0.004 0.004
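The ratios above read as the share of non-missing values per column. That interpretation is an assumption, as are the function name `completeness`, the toy rows, and the choice of `None`, the empty string, and "N/A" as missing markers. A small Python sketch of such a check:

```python
def completeness(rows, columns, missing=(None, "", "N/A")):
    """Return the fraction of rows with a usable value for each column."""
    total = len(rows)
    result = {}
    for col in columns:
        present = sum(1 for row in rows if row.get(col) not in missing)
        result[col] = round(present / total, 3) if total else 0.0
    return result

# Hypothetical toy rows, not taken from the data set.
rows = [
    {"Title": "A", "Metascore": 80, "BoxOffice": None},
    {"Title": "B", "Metascore": None, "BoxOffice": None},
    {"Title": "C", "Metascore": 75, "BoxOffice": 1000},
    {"Title": "D", "Metascore": 90, "BoxOffice": "N/A"},
]
report = completeness(rows, ["Title", "Metascore", "BoxOffice"])
print(report)
```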
Exploratory Data Analysis
1D Analysis
- Observe the characteristics of each variable on its own
- Present every variable kept after data munging
Year
- Mean: 1982.676, SD: 24.80921

Runtime
- Mean: 126.808, SD: 29.76338
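The mean and SD figures on these slides can be reproduced with a few lines. The sketch below uses Python's standard library and assumes the sample standard deviation (n - 1 in the denominator), which is what R's `sd()` computes. The function name `summarize` and the sample values are hypothetical.

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation), skipping missing values."""
    clean = [v for v in values if v is not None]
    return statistics.mean(clean), statistics.stdev(clean)

# Hypothetical values, not the movie data.
mean, sd = summarize([2, 4, 4, 4, 5, 5, 7, 9, None])
print(f"Mean: {mean}, SD: {sd:.5f}")
```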

Directors Pt2
## x freq
## 4 Alfred Hitchcock 9
## 9 Billy Wilder 7
## 14 Charles Chaplin 5
## 16 Christopher Nolan 7
## 36 Frank Capra 4
## 86 Martin Scorsese 7
## 108 Quentin Tarantino 4
## 115 Ridley Scott 4
## 134 Stanley Kubrick 8
## 136 Steven Spielberg 7
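A frequency table like the one above can be built by counting names and keeping those that appear often enough. The cutoff of 4 is inferred from the smallest frequency shown and is an assumption, as are the function name and the toy input. A Python sketch:

```python
from collections import Counter

def frequent_values(values, min_count=4):
    """Return (value, count) pairs with count >= min_count, sorted by value."""
    counts = Counter(values)
    return sorted((name, n) for name, n in counts.items() if n >= min_count)

# Hypothetical director column, one entry per movie.
directors = ["A"] * 5 + ["B"] * 3 + ["C"] * 4
table = frequent_values(directors)
for name, freq in table:
    print(name, freq)
```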
Writers Pt1
## x freq
## 27 Charles Chaplin 5
## 158 Quentin Tarantino 4
## 185 Stanley Kubrick 6
## 187 Stephen King 4
imdbRating
- Mean: 8.244, SD: 0.2457347

imdbVotes
- Mean: 431401.3, SD: 367703.3

BoxOffice
- Mean: 158727934, SD: 169133912

Production Pt2
## x freq
## 1 20th Century Fox 15
## 9 Buena Vista Pictures 4
## 11 Columbia Pictures 10
## 41 MGM 9
## 45 Miramax Films 7
## 46 New Line Cinema 6
## 55 Paramount Pictures 17
## 62 Sony Pictures 5
## 72 Twentieth Century Fox Home Entertainment 4
## 73 United Artists 15
## 78 Universal Pictures 14
## 81 Walt Disney Pictures 8
## 82 Warner Bros. 10
## 83 Warner Bros. Pictures 27
Actors and Actresses Pt1
## x freq
## 10 Al Pacino 4
## 97 Carrie Fisher 4
## 101 Cary Grant 5
## 106 Charles Chaplin 4
## 120 Christian Bale 4
## 281 Harrison Ford 7
Awards Pt1
- Mean: 29.44, SD: 40.12799

Awards Pt2
- 28 movies with 75 awards or more
## Title Awards imdbRating
## 1: The Dark Knight 153 9.0
## 2: Schindler's List 78 8.9
## 3: The Lord of the Rings: The Return of the King 208 8.9
## 4: The Lord of the Rings: The Fellowship of the Ring 117 8.8
## 5: Inception 154 8.8
## 6: The Lord of the Rings: The Two Towers 120 8.7
Nominations Pt1
- Mean: 40.308, SD: 52.18792

Nominations Pt2
- 25 movies with 120 nominations or more
## Title Nominations imdbRating
## 1: The Dark Knight 153 9.0
## 2: The Lord of the Rings: The Return of the King 122 8.9
## 3: The Lord of the Rings: The Fellowship of the Ring 124 8.8
## 4: Inception 203 8.8
## 5: The Lord of the Rings: The Two Towers 138 8.7
## 6: Interstellar 142 8.6
Awards and Nominations Pt1
- 34 movies with 75+ awards or 120+ nominations
## Title Awards Nominations
## 1: The Dark Knight 153 153
## 2: Schindler's List 78 33
## 3: The Lord of the Rings: The Return of the King 208 122
## 4: The Lord of the Rings: The Fellowship of the Ring 117 124
## 5: Inception 154 203
## 6: The Lord of the Rings: The Two Towers 120 138
## imdbRating
## 1: 9.0
## 2: 8.9
## 3: 8.9
## 4: 8.8
## 5: 8.8
## 6: 8.7
Awards and Nominations Pt2
- 19 movies with 75+ awards and 120+ nominations
## Title Awards Nominations
## 1: The Dark Knight 153 153
## 2: The Lord of the Rings: The Return of the King 208 122
## 3: The Lord of the Rings: The Fellowship of the Ring 117 124
## 4: Inception 154 203
## 5: The Lord of the Rings: The Two Towers 120 138
## 6: The Departed 96 134
## imdbRating
## 1: 9.0
## 2: 8.9
## 3: 8.8
## 4: 8.8
## 5: 8.7
## 6: 8.5
Awards and Nominations Pt3
- 9 movies with 75+ awards and fewer than 120 nominations
## Title Awards Nominations imdbRating
## 1: Schindler's List 78 33 8.9
## 2: Saving Private Ryan 79 74 8.6
## 3: WALL·E 91 90 8.4
## 4: American Beauty 108 98 8.4
## 5: L.A. Confidential 87 77 8.3
## 6: Up 76 82 8.3
Awards and Nominations Pt4
- 6 movies with fewer than 75 awards and 120+ nominations
## Title Awards Nominations imdbRating
## 1: Interstellar 42 142 8.6
## 2: Django Unchained 58 151 8.4
## 3: The Wolf of Wall Street 38 170 8.2
## 4: Gone Girl 64 177 8.1
## 5: The Imitation Game 45 150 8.1
## 6: The Martian 34 187 8.0
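The four groups above are a partition by two thresholds: 19 movies meet both, 9 meet only the awards threshold, and 6 meet only the nominations threshold, which gives 19 + 9 = 28, 19 + 6 = 25, and 19 + 9 + 6 = 34. A Python sketch of that split, using a handful of rows copied from the tables above (the function name `partition` is hypothetical):

```python
def partition(movies, min_awards=75, min_nominations=120):
    """Split (title, awards, nominations) rows into three groups."""
    both, awards_only, nominations_only = [], [], []
    for title, awards, nominations in movies:
        many_awards = awards >= min_awards
        many_noms = nominations >= min_nominations
        if many_awards and many_noms:
            both.append(title)
        elif many_awards:
            awards_only.append(title)
        elif many_noms:
            nominations_only.append(title)
    return both, awards_only, nominations_only

movies = [
    ("The Dark Knight", 153, 153),
    ("Schindler's List", 78, 33),
    ("Interstellar", 42, 142),
    ("The Departed", 96, 134),
    ("Up", 76, 82),
    ("The Martian", 34, 187),
]
both, awards_only, nominations_only = partition(movies)
print(both, awards_only, nominations_only)
```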
Language Pt1
## x freq
## 3 Arabic 8
## 5 Cantonese 4
## 8 English 250
## 10 French 41
## 11 German 32
## 18 Italian 17
## 19 Japanese 6
## 21 Latin 13
## 31 Russian 12
## 35 Spanish 31
## 40 Vietnamese 4
Country Pt1
## x freq
## 1 Australia 6
## 4 Canada 6
## 7 France 12
## 8 Germany 11
## 12 Ireland 4
## 13 Italy 4
## 27 UK 55
## 29 USA 233
Number of Awards, BoxOffice, and imdbRatings Pt1
Number of Awards, BoxOffice, and imdbRatings Pt2
Nominations and imdbRatings

Nominations and Year

Nominations and imdbRatings Pt3
IMDB Ratings and Production Company
Correlation Pt 1
## Year Runtime Metascore imdbRating imdbVotes
## Year 1.00000000 0.17938049 -0.34085225 0.04549597 0.53623097
## Runtime 0.17938049 1.00000000 -0.06702619 0.24776081 0.24974676
## Metascore -0.34085225 -0.06702619 1.00000000 0.17211994 -0.09800265
## imdbRating 0.04549597 0.24776081 0.17211994 1.00000000 0.65668014
## imdbVotes 0.53623097 0.24974676 -0.09800265 0.65668014 1.00000000
## BoxOffice 0.33354876 0.16853031 0.07890235 0.12349854 0.38711525
## Awards 0.47198940 0.19118929 0.30717283 0.19979238 0.44580066
## Nominations 0.61632678 0.18631483 0.11100696 0.08192601 0.47425298
## BoxOffice Awards Nominations
## Year 0.33354876 0.4719894 0.61632678
## Runtime 0.16853031 0.1911893 0.18631483
## Metascore 0.07890235 0.3071728 0.11100696
## imdbRating 0.12349854 0.1997924 0.08192601
## imdbVotes 0.38711525 0.4458007 0.47425298
## BoxOffice 1.00000000 0.1831188 0.20620724
## Awards 0.18311879 1.0000000 0.84812343
## Nominations 0.20620724 0.8481234 1.00000000
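Each entry in the matrix is a Pearson correlation coefficient between two columns. The Python sketch below computes one such coefficient from scratch. Dropping pairs where either value is missing is an assumption about how incomplete columns such as BoxOffice were handled, and the function name `pearson` and the sample inputs are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Pairs containing None are dropped before the calculation.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two complete pairs")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

r = pearson([1, 2, None, 4], [2, 4, 5, 8])
print(round(r, 3))
```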
Correlation Pt 2

ImdbRating and Year
- Correlation Coefficient: 0.045

ImdbRating and Runtime
- Correlation Coefficient: 0.248

ImdbRating and ImdbVotes
- Correlation Coefficient: 0.657

ImdbRating and BoxOffice
- Correlation Coefficient: 0.123

ImdbRating and Awards
- Correlation Coefficient: 0.200

ImdbRating and Nominations
- Correlation Coefficient: 0.082

Questions?
Thank You